Training a Korean SRL System with Rich Morphological Features
نویسندگان
چکیده
In this paper we introduce a semantic role labeler for Korean, an agglutinative language with rich morphology. First, we create a novel training source by semantically annotating a Korean corpus containing fine-grained morphological and syntactic information. We then develop a supervised SRL model by leveraging morphological features of Korean that tend to correspond with semantic roles. Our model also employs a variety of latent morpheme representations induced from a larger body of unannotated Korean text. These elements lead to state-of-the-art performance of 81.07% labeled F1, representing the best SRL performance reported to date for an agglutinative language.
منابع مشابه
Semantic Role Labeling Systems for Arabic using Kernel Methods
There is a widely held belief in the natural language and computational linguistics communities that Semantic Role Labeling (SRL) is a significant step toward improving important applications, e.g. question answering and information extraction. In this paper, we present an SRL system for Modern Standard Arabic that exploits many aspects of the rich morphological features of the language. The ex...
متن کاملStatistical Dependency Parsing in Korean: From Corpus Generation To Automatic Parsing
This paper gives two contributions to dependency parsing in Korean. First, we build a Korean dependency Treebank from an existing constituent Treebank. For a morphologically rich language like Korean, dependency parsing shows some advantages over constituent parsing. Since there is not much training data available, we automatically generate dependency trees by applying head-percolation rules an...
متن کاملPKU_HIT: An Event Detection System Based on Instances Expansion and Rich Syntactic Features
This paper describes the PKU_HIT system on event detection in the SemEval-2010 Task. We construct three modules for the three sub-tasks of this evaluation. For target verb WSD, we build a Naïve Bayesian classifier which uses additional training instances expanded from an untagged Chinese corpus automatically. For sentence SRL and event detection, we use a feature-based machine learning method w...
متن کاملImproving Chinese Semantic Role Labeling with Rich Syntactic Features
Developing features has been shown crucial to advancing the state-of-the-art in Semantic Role Labeling (SRL). To improve Chinese SRL, we propose a set of additional features, some of which are designed to better capture structural information. Our system achieves 93.49 Fmeasure, a significant improvement over the best reported performance 92.0. We are further concerned with the effect of parsin...
متن کاملExploiting Rich Syntactic Information for Relation Extraction from Biomedical Articles∗
This paper proposes a ternary relation extraction method primarily based on rich syntactic information. We identify PROTEIN-ORGANISM-LOCATION relations in the text of biomedical articles. Different kernel functions are used with an SVM learner to integrate two sources of information from syntactic parse trees: (i) a large number of syntactic features that have been shown useful for Semantic Rol...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014